A Statistical Learning Perspective of Genetic Programming
نویسندگان
چکیده
This paper proposes a theoretical analysis of Genetic Programming (GP) from the perspective of statistical learning theory, a well grounded mathematical toolbox for machine learning. By computing the Vapnik-Chervonenkis dimension of the family of programs that can be inferred by a specific setting of GP, it is proved that a parsimonious fitness ensures universal consistency. This means that the empirical error minimization allows convergence to the best possible error when the number of test cases goes to infinity. However, it is also proved that the standard method consisting in putting a hard limit on the program size still results in programs of infinitely increasing size in function of their accuracy. It is also shown that cross-validation or hold-out for choosing the complexity level that optimizes the error rate in generalization also leads to bloat. So a more complicated modification of the fitness is proposed in order to avoid unnecessary bloat while nevertheless preserving universal consistency.
منابع مشابه
A New Correlation Based on Multi-Gene Genetic Programming for Predicting the Sweet Natural Gas Compressibility Factor
Gas compressibility factor (z-factor) is an important parameter widely applied in petroleum and chemical engineering. Experimental measurements, equations of state (EOSs) and empirical correlations are the most common sources in z-factor calculations. However, these methods have serious limitations such as being time-consuming as well as those from a computational point of view, like instabilit...
متن کاملModeling Ghotour-Chai River’s Rainfall-Runoff process by Genetic Programming
Considering the importance of water and computing the amount of rainfall runoff resulted from precipitation in recent decades, using appropriate methods for predicting the amount of runoff from rainfall date has been really essential. Rainfall-runoff models are used to estimate runoff generated from precipitation in the catchment area. Rainfall-runoff process is totally a non-linear phenomenon....
متن کاملAn Application of Genetic Network Programming Model for Pricing of Basket Default Swaps (BDS)
The credit derivatives market has experienced remarkable growth over the past decade. As such, there is a growing interest in tools for pricing of the most prominent credit derivative, the credit default swap (CDS). In this paper, we propose a heuristic algorithm for pricing of basket default swaps (BDS). For this purpose, genetic network programming (GNP), which is one of the recent evolutiona...
متن کاملTwo-stage fuzzy-stochastic programming for parallel machine scheduling problem with machine deterioration and operator learning effect
This paper deals with the determination of machine numbers and production schedules in manufacturing environments. In this line, a two-stage fuzzy stochastic programming model is discussed with fuzzy processing times where both deterioration and learning effects are evaluated simultaneously. The first stage focuses on the type and number of machines in order to minimize the total costs associat...
متن کاملDAMAGE AND PLASTICITY CONSTANTS OF CONVENTIONAL AND HIGH-STRENGTH CONCRETE PART II: STATISTICAL EQUATION DEVELOPMENT USING GENETIC PROGRAMMING
Several researchers have proved that the constitutive models of concrete based on combination of continuum damage and plasticity theories are able to reproduce the major aspects of concrete behavior. A problem of such damage-plasticity models is associated with the material constants which are needed to be determined before using the model. These constants are in fact the connectors of constitu...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009